Abstract - In cognitive radio (CR) networks, the perceived reduction of application layer quality of
service (QoS), such as multimedia distortion, by secondary users may impede the success of CR technologies.
Most previous work in CR networks ignores application layer QoS. In this paper we take an integrated
design approach to jointly optimize the multimedia intra refreshing rate, an application layer parameter,
together with the access strategy and spectrum sensing for multimedia transmission in a CR system with
time varying wireless channels. Primary network usage and channel gain are modeled as a finite state
Markov process. With channel sensing and channel state information errors, the system state cannot be
directly observed. We formulate the QoS optimization problem as a partially observable Markov decision
process (POMDP). A low complexity dynamic programming framework is presented to obtain the optimal
policy. Simulation results show the effectiveness of the proposed scheme.
1 Introduction



Recent widespread acceptance of wireless applications has triggered a huge demand for radio spectrum.
For many years radio spectrum has been assigned to licensed (primary) users. Most of the time, some
frequency bands in the radio spectrum remain largely unoccupied by primary users. Spectrum usage
measurements by the Federal Communications Commission (FCC) show that at any given time and location,
most of the spectrum is actually idle. That is, the spectrum shortage results from the spectrum
management policy rather than from the physical scarcity of usable spectrum. Cognitive radio (CR), which
was introduced in [1], is considered an enabling technology that allows unlicensed (secondary) users to
operate in the licensed spectrum bands. This can help overcome the lack of available spectrum in wireless
communications. CR is capable of sensing its surrounding environment and adapting its internal states by
making corresponding changes in certain operating parameters [2], [3], [4], [5]. The FCC in the United
States has begun to consider more flexible use of available spectrum. The NeXt Generation program of the
Defense Advanced Research Projects Agency also aims to redistribute allocated spectrum dynamically.

One important application of CR is spectrum overlay dynamic spectrum access (DSA), where secondary
users operate in the licensed band while limiting interference with primary users. Spectrum opportunities
are detected and used by secondary users in the time and frequency domain [6], [7], [8]. An optimal
spectrum sensing strategy is proposed in [9] to maximize throughput. A separation principle is established
in [10] to decouple the design of the sensing strategy from that of the spectrum sensor and the access
strategy. The benefits of cooperation in CR are illustrated in [11] and [12] for two- and multi-user
networks, respectively. A dynamic frequency hopping scheme is presented in [13] for IEEE 802.22 wireless
regional area networks, which is an emerging standard based on CR technologies. In [14], the authors
present a game theoretical dynamic spectrum sharing framework for analysis of network users' behaviors,
efficient dynamic distributed design, and optimality analysis. Other game theoretic DSA methods are
presented in [15], [16]. The authors in [17] exploit channel availability in the time domain and demonstrate
the throughput performance for a Bluetooth/WLAN system. Spectrum opportunity is also exploited in the
time domain in [18], where the authors present an ad hoc secondary MAC protocol to facilitate DSA.

Although much work has been done in CR networks, most previous work considers maximizing the 
throughput of secondary users as one of the most important design criteria. As a consequence, other 
QoS measures for secondary users, such as distortion for multimedia applications, are mostly ignored in 
the literature. However, recent work in cross-layer design shows that maximizing throughput does not 
necessarily benefit QoS at the application layer for some multimedia applications, such as video [19], [20].
From a user's point of view, QoS at the application layer is more important than that at other layers.
Moreover, CR-based services for secondary users would have a strictly lower QoS than radio services
that enjoy guaranteed spectrum access [21], [22]. Therefore, if the application layer QoS is not carefully
considered in CR networks, the perceived reduction in QoS associated with CR may impede the success 
of CR technologies. 

Multimedia applications such as video telephony, conferencing, and video surveillance are being tar- 
geted for wireless networks, including CR networks. Lossy video compression standards, such as MPEG-4 
and H.264, exploit the spatial and temporal redundancy in video streams to reduce the required bandwidth 
to transmit video. Compressed video comprises intra- and inter-coded frames. The intra refreshing rate
is an important application layer parameter [23]. Adaptively adjusting the intra refreshing rate for online
video encoding applications can improve error resilience to the time varying wireless channels available 
to secondary users in CR networks. 

Cross-layer wireless multimedia transmission, where parameter optimization is considered jointly
across OSI layers, has been well studied in the literature [24], [25], [26], [27], [28]. Recent work shows
promising improvement to video QoS by considering resource management, adaptation, and protection
strategies available at the physical, medium access control, and network/transport layers in conjunction
with multimedia compression and streaming algorithms [29], [30], [31], [32], [33], [34], [35]. Various
channel adaptive distortion driven cross-layer transmission strategies have been explored. The authors in
[36] investigate a classification based system where the optimal cross-layer strategies for various video
and channel conditions are computed offline, thereby reducing the transmission-time complexity of the
compression and transmission strategy. Within a rate-distortion framework, source coding, retransmission,
and adaptive modulation parameters are jointly considered for video summary in [37]. The authors in [38]
take a cross-layer approach to allocate power level, source coding rate, and channel coding rate,
delivering basic and enhanced QoS levels for distant and near receivers in a CDMA network.

Although there are some cross-layer design techniques for wireless multimedia transmission in the lit- 
erature, little work investigates channel adaptive multimedia transmission over a cognitive radio network. 
In this paper, we take an integrated design approach to jointly optimize application layer QoS for multime- 
dia transmission over cognitive radio networks. Based on the sensed channel condition, secondary users 
can adapt the intra refreshing rate at the application layer, in addition to the parameters at other layers. 
Some distinct features of the proposed scheme are as follows. 

• For secondary users in CR networks, channel selection for spectrum sensing, access decision, and 
intra refreshing rate are determined concurrently to maximize the QoS at the application layer (i.e., 
minimize distortion for video applications). 

• Physical layer channel state information (CSI) (channel gain) is used by secondary users to help 
make the optimal decision to maximize the application layer QoS. 

• Primary network usage and channel gain are modeled as a finite state Markov process. With channel
sensing and CSI errors, the state cannot be directly observed. Following the work in [10], we
formulate the whole system as a partially observable Markov decision process (POMDP) [39]. We
extend the scheme to jointly optimize application layer QoS for multimedia transmission over
cognitive radio networks.

• Using simulation examples, we show that application layer parameters have significant impact on 
the QoS perceived by secondary users in CR networks. We also show that application layer QoS 
can be improved significantly if the intra refreshing rate is adapted together with parameters at
lower layers, such as spectrum sensing. This study reveals a number of interesting observations and
provides insights into the design and optimization of CR networks from a cross-layer perspective. 



The rest of the paper is organized as follows. Section 2 describes the multimedia transmission over
CR networks problem. Section 3 presents the proposed scheme. Some simulation results are given in
Section 4. Finally, we conclude this study in Section 5.

2 Multimedia Transmission over Cognitive Radio Networks 

In this section, we describe the multimedia rate-distortion model used in this paper. We then present the 
system model for multimedia transmission over cognitive radio networks. 

2.1 Rate-Distortion (R-D) Model for Multimedia Applications 

Wireless channels have limited bandwidth and are error-prone. Highly efficient coding algorithms such 
as H.264 and MPEG-4 can compress video to reduce the required bandwidth for the video stream. Rate 
control is used in video coding to control the video encoder output bit rate based on various conditions to
improve video quality [40]. For example, the main tasks of MPEG-4 object-based video coding are (1) to
determine how many bits are assigned to each video object in the scene and (2) to adjust the quantization
parameter to accurately achieve the target coding bit rate [41].

Highly compressed video data is vulnerable to packet losses where a single bit error may cause severe
distortion [42], [43]. This vulnerability makes error resilience at the video encoder essential. Intra update,
also called intra refreshing, of macroblocks (MBs) is one approach for video error resilience and protection
[44]. An intra coded MB does not need information from previous frames which may have already been
corrupted by channel errors. This makes intra coded MBs an effective way to mitigate error propagation.
In contrast, with inter-coded MBs, channel errors from previous frames may still propagate to the current
frame along the motion compensation path [45].

Given a source-coding bit rate $R_s$ and intra refreshing rate, we need a model to estimate the
corresponding source distortion $D_s$. The authors in [23] provide a closed form distortion model taking into
account varying characteristics of the input video, the sophisticated data representation scheme of the
coding algorithm, and the intra refreshing rate. Based on the statistical analysis of the error propagation,
error concealment, and channel decoding, a theoretical framework is developed to estimate the channel
distortion, $D_c$. Coupled with the R-D model for source coding and time varying wireless channels, an
adaptive mode selection is proposed for wireless video coding and transmission.

We will use the rate-distortion model described in [23] in our study. The R-D model facilitates adaptive
intra-mode selection and joint source-channel rate control.

The total end-to-end distortion comprises $D_s$, the quantization distortion introduced by the lossy
video encoder to meet a target bit rate, and $D_c$, the distortion resulting from channel errors. For
DCT-based video coding, intra coding of an MB or a frame usually requires more bits than inter coding since
inter coding removes the temporal redundancy between two neighboring frames. Let $\beta$ be the intra
refreshing rate, the percentage of MBs coded with intra mode. Inter coding of MBs has much better R-D
performance than intra mode. Decreasing the intra refreshing rate decreases the source distortion for a
target bit rate. However, inter coding relies on information in previous frames. Packet losses due to
channel errors result in error propagation along the motion-compensation path until the next intra coded MB
is received. Increasing the intra refreshing rate decreases the channel distortion. Thus we have a tradeoff
between source and channel distortion when selecting the intra refreshing rate. We aim to find the optimal
$\beta$ to minimize the total end-to-end distortion given the channel bandwidth and packet loss ratio.

The source distortion is given by

$$D_s(R_s, \beta) = D_s(R_s, 0) + \beta(1 - \eta + \eta\beta)\,[D_s(R_s, 1) - D_s(R_s, 0)], \qquad (1)$$

where $R_s$ denotes the source coding rate, $\beta$ is the intra refreshing rate, and $\eta$ is a constant based on the
video sequence. $D_s(R_s, 0)$ and $D_s(R_s, 1)$ denote the time-average source distortion for all-inter and
all-intra mode selection, respectively, for all frames over $K$ time slots.

[Equations (2) and (3), which define these time averages, are missing from the source text.]

where $M_k$ is the number of inter/intra frames in time slot $k$. The average channel distortion for each time
slot is given by

$$D_c(p, \beta) = \frac{a}{1 - b + b\beta}\,\frac{p}{1 - p}\,E[F_d(m, m-1)], \qquad (4)$$

where $p$ is the packet loss rate, $b$ is a constant describing motion randomness of the video scene, $a$ is the
energy loss ratio of the encoder filter, and $E[F_d(m, m-1)]$ is the average value of the frame difference
$F_d(m, m-1)$ over $K$ slots. We will use the same error concealment strategy and packet loss ratio derivation
as described in [23].

The total average distortion is given by

$$D(R_s, p, \beta) = D_s(R_s, \beta) + D_c(p, \beta). \qquad (5)$$

The optimum $\beta^*$ is then selected to minimize the total distortion:

$$\beta^* = \arg\min_{\beta} D(R_s, p, \beta). \qquad (6)$$
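The tradeoff in (5) and (6) can be illustrated with a short Python sketch. It is not the authors' implementation: the function names and the grid search over $\beta$ are assumptions made here for illustration, and the default parameter values are the ones quoted later for the simulations ($D_s(R_s,0) = 74$, $D_s(R_s,1) = 124$, $\eta = 1.4$, $a = 0.01$, $b = 1.0$, $E[F_d(m,m-1)] = 100$).

```python
import math

def source_distortion(beta, d_inter=74.0, d_intra=124.0, eta=1.4):
    """Eq. (1): source distortion at a fixed source rate R_s."""
    return d_inter + beta * (1.0 - eta + eta * beta) * (d_intra - d_inter)

def channel_distortion(p, beta, a=0.01, b=1.0, mean_frame_diff=100.0):
    """Eq. (4): channel distortion for packet loss rate p."""
    if p <= 0.0:
        return 0.0            # no losses, no channel distortion
    if p >= 1.0:
        return math.inf       # every packet lost
    denom = 1.0 - b + b * beta
    if denom <= 0.0:
        return math.inf       # errors are never refreshed away
    return (a / denom) * (p / (1.0 - p)) * mean_frame_diff

def total_distortion(p, beta):
    """Eq. (5): end-to-end distortion."""
    return source_distortion(beta) + channel_distortion(p, beta)

def best_beta(p, steps=100):
    """Eq. (6) by grid search (an assumed, simple solver) over beta in (0, 1]."""
    grid = [i / steps for i in range(1, steps + 1)]
    return min(grid, key=lambda beta: total_distortion(p, beta))
```

Under these assumptions, a higher packet loss rate pushes the minimizer toward a larger intra refreshing rate, which matches the qualitative argument above.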

2.2 System Model 

Consider a spectrum that consists of $N$ channels, each with bandwidth $W(n)$, $1 \le n \le N$. These $N$
channels are licensed to primary users. Time is divided into slots of equal length $T$. Slot $k$ refers to the
discrete time period $[kT, (k+1)T]$.

When the slot is not in use by primary users, the channel is subject to additive white Gaussian noise
(AWGN) and fading. The fading process and primary usage for a channel can be represented by a stationary
and ergodic $S$-state Markov chain. Let $i$ and $\gamma$ denote the instantaneous channel state and fading gain,
respectively. When the channel is in state $i$, the quantized fading gain is $\gamma_i$, where
$\gamma_i \le \gamma < \gamma_{i+1}$, $1 \le i \le S-1$. When the channel is in state $i = S$, the channel is in use by the
primary network. We assume that the phase of the channel attenuation can be perfectly estimated and removed
at the receiver. The $S$-state Markov channel model is completely described by the stationary distribution of
each channel state $i$, denoted by $p(i)$, and the probability of transitioning from state $i$ into state $j$
after each time slot, denoted by $\{P_{i,j}\}$, $1 \le i, j \le S$.

In general, a finite state Markov channel (FSMC) model is constructed for a particular fading
distribution by first partitioning the range of the fading gain into a finite number of sections. Then each
section of the gain value corresponds to a state in the Markov chain. The application of FSMC to model
Rayleigh channels has been well studied in [46], [47]. Given knowledge of the fading process and primary
network usage, the stationary distribution $p(i)$ as well as the channel state transition probabilities
$\{P_{i,j}\}$ can be derived. Once a channel gain has been determined for states $1, 2, \ldots, S-1$, the packet
loss ratio is determined for each state based on the modulation and channel coding schemes. The intra
refreshing rate that minimizes the total distortion for each state can then be calculated using the
rate-distortion model.
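As a small illustration of how the stationary distribution $p(i)$ follows from the transition probabilities $\{P_{i,j}\}$, the Python sketch below iterates the chain until the distribution stops changing. The function name and the 3-state example matrix (two gain states plus one busy state) are hypothetical values chosen here, not values from the paper.

```python
def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Power iteration for a row-stochastic matrix P, where P[i][j] = Pr{i -> j}."""
    n = len(P)
    dist = [1.0 / n] * n                       # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist

# Hypothetical 3-state chain: states 0 and 1 are available, state 2 is busy.
example_P = [
    [0.85, 0.10, 0.05],
    [0.10, 0.85, 0.05],
    [0.45, 0.45, 0.10],
]
example_p = stationary_distribution(example_P)
```

Power iteration is used only because it needs no external libraries; for an ergodic chain, solving the linear system $pP = p$ with $\sum_i p(i) = 1$ gives the same result.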

At the beginning of a slot, the transmitter of secondary users will select a set of channels to sense. 
Based on the sensing outcome, the transmitter will decide whether or not to access a channel. If the 
transmitter decides to access a channel, some application layer parameters will be selected and the video 
content will be transmitted. At the end of the slot, the receiver will acknowledge the transfer by sending 
the perceived channel gain back to the transmitter. We will assume a system for real-time multimedia 
applications where packets are discarded if a primary user is using the slot or if the channel is not accessed. 
The system block diagram showing video transmission between two secondary users is shown in Fig. 1.

3 Solving the Multimedia Transmission over Cognitive Radio Networks Problem

In CR networks with multimedia applications, we need to determine the optimal policy for channel sensing 
selection, sensor operating point, access decision, and intra refreshing rate to minimize application layer 
distortion subject to the system probability of collision. With channel sensing and CSI errors, the system 
state cannot be directly observed. Following the work in [10], we formulate the whole system as a partially
observable Markov decision process (POMDP). Deriving a single POMDP formulation for all policies
under the probability of collision constraint would result in a constrained POMDP. However, constrained
POMDPs require randomized policies to achieve optimality, and finding such policies is often intractable.
Therefore, we use the separation principle in [10] for the sensor operating point and the access decision.
The spectrum sensor operating point is set such that $\delta = \zeta$, where $\delta$ is the probability of missed
detection of a busy channel used by primary users and $\zeta$ is the required probability of collision.

At the beginning of the slot, the system transitions to a new state. Using a POMDP derived policy, a 
channel is selected for spectrum sensing. An access decision is then made based on the sensing observa- 
tion. Using the belief of the channel state, an intra refreshing rate is selected. The receiver acknowledges 
the transfer by sending the quantized perceived channel gain back to the secondary transmitter. The im- 
mediate cost for the time slot is derived based on the previous operations in the slot. 

The system can be formulated as a POMDP with states, actions, transition probabilities, observations, 
and cost structures as follows. 

3.1 State Space, Transition Probabilities and Observation Space 

The system state is given by the network usage of primary users and channel state information. Let
$\{X(n)\}$ denote an $S$-state Markov chain for channel $n$, $X(n) \in \mathcal{X} = \{e_1, e_2, \ldots, e_{S-1}, e_S\}$,
where $e_i$ denotes the $S$-dimensional unit vector with 1 in the $i$th position and zeros elsewhere. The system
with $N$ channels is modeled as a discrete-time homogeneous Markov process with $S^N$ states. The system
state in time slot $k$ is given by $V_k = [X_k(1), \ldots, X_k(N)]$.

To simplify the presentation, we consider a system with a single channel in the formulation. It is
straightforward to extend the formulation to include multiple channels, which is considered in our
simulations. For the system with a single channel, $V_k = X_k$. The transition probabilities of the system
state are given by the $S \times S$ matrix $A$. We assume the transition probabilities are known based on
network usage and channel fading characteristics.

The observation available to the secondary transmitter and receiver is the sensed channel and channel
gain acknowledgment, $Y_k \in \mathcal{Y}$, where $\mathcal{Y} = \{\gamma_1, \ldots, \gamma_{S-1}, \gamma_S\}$ ($\gamma_S$ indicates that the
channel is used by primary users) and $\gamma_i < \gamma_j$, $\forall i < j$.

The spectrum sensor observation may be different at the transmitter and receiver. If the transmitter
and receiver use the same observations to derive the information state (described in the following
subsection), then the information state can be used to maintain frequency hopping synchronization. Thus the
information state will be updated with $Y_k$ and will not include the spectrum sensor observation.

Let $B(y, x, a) = \Pr\{y \mid x, a\}$ denote the conditional probability of observing $y$ given that the system
is in state $x$ and composite action $a$ was taken:

$$B(y, x, a) = \begin{cases}
P_{ce}(x, v(y))(1 - \epsilon), & \text{if } y \ne \gamma_S,\ x \ne e_S, \\
\epsilon, & \text{if } y = \gamma_S,\ x \ne e_S, \\
0, & \text{if } y \ne \gamma_S,\ x = e_S, \\
1, & \text{if } y = \gamma_S,\ x = e_S,
\end{cases} \qquad (7)$$

where $\epsilon$ is the probability of missed detection of the idle channel (i.e., a false alarm) and
$v(y) = i$, $1 \le i \le S$, given $y = \gamma_i$. When the channel is available and accessed, the probability of
the channel estimate by the receiver is given by $P_{ce}(x, v(y))$.

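Using the work from Hoang and Motani [48], we assume the channel estimation error has a Gaussian
distribution with zero mean and variance $\sigma^2$. At a particular time and channel, the estimated channel gain
is

$$\hat{\gamma} = \gamma_i + w, \qquad (8)$$

where $\gamma_i$ is the actual channel gain and $w$ is a Gaussian random variable with zero mean and variance
$\sigma^2$. The receiver then quantizes the channel gain to the nearest possible value. The probability that
$\hat{\gamma}$ is closest to $\gamma_j$ is given by

$$P_{ce}(i, j) = \begin{cases}
\frac{1}{2}\left[\operatorname{erf}\left(\frac{\gamma_j + \gamma_{j+1} - 2\gamma_i}{2\sqrt{2}\,\sigma}\right) - \operatorname{erf}\left(\frac{\gamma_{j-1} + \gamma_j - 2\gamma_i}{2\sqrt{2}\,\sigma}\right)\right], & \text{if } j \ne e_1, e_{S-1}, e_S, \\
\frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{\gamma_1 + \gamma_2 - 2\gamma_i}{2\sqrt{2}\,\sigma}\right)\right], & \text{if } j = e_1, \\
\frac{1}{2}\left[1 - \operatorname{erf}\left(\frac{\gamma_{S-2} + \gamma_{S-1} - 2\gamma_i}{2\sqrt{2}\,\sigma}\right)\right], & \text{if } j = e_{S-1}, \\
0, & \text{if } j = e_S,
\end{cases} \qquad (9)$$

where $\operatorname{erf}(\cdot)$ denotes the error function.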
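The nearest-value quantization described above can be sketched in Python as follows. This is an illustrative sketch under the stated Gaussian error assumption, not the authors' code: the function name and the example gains are hypothetical, and the decision boundaries are taken to be the midpoints between adjacent quantized gains.

```python
import math

def estimate_probabilities(gains, i, sigma):
    """Probability that a noisy estimate of gains[i] is quantized to each gain.

    gains: quantized gains of the available states, sorted in ascending order.
    i:     index of the true gain.
    sigma: standard deviation of the Gaussian estimation error.
    """
    def cdf(t):
        # Gaussian CDF of the estimate, centered at the true gain.
        return 0.5 * (1.0 + math.erf((t - gains[i]) / (sigma * math.sqrt(2.0))))

    # Nearest-value quantization puts the decision boundaries at the midpoints.
    midpoints = [(lo + hi) / 2.0 for lo, hi in zip(gains, gains[1:])]
    edges = [0.0] + [cdf(m) for m in midpoints] + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

# Hypothetical example: three available gains, true state in the middle.
example_row = estimate_probabilities([0.2, 0.5, 0.8], i=1, sigma=0.1)
```

Each returned row sums to one over the available states. The busy state is not included because, in the observation model, a busy observation is produced by the spectrum sensor rather than by the channel estimate.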
3.2 Information State

The information state is an important concept in POMDPs. We will refer to a probability distribution over
states as the information state and the entire probability space (the set of all possible probability
distributions) as the information space. The information spaces for 2-state and 3-state systems are shown
in Fig. 2. For a system with two states, the information space is a one-dimensional line segment. The
distance from the right end is the first component $\pi(1)$ and the distance from the left end is the second
component $\pi(2)$. For a system with three states, the information space is a two-dimensional triangle. The
value of a point in the information space can be obtained from the perpendicular distances from the sides
of the triangle. An information state is a sufficient statistic for the decision and observation history.

3.3 Action Space 

Due to hardware limitations, we will assume that a secondary user is equipped with a single
Neyman-Pearson energy detector and can only sense $L = 1$ channel at each time instant. In each slot $k$, the
secondary user needs to decide whether or not to sense, determine which sensor operating point on the
receiver operating characteristic (ROC) curve to use, whether to access the channel, and which quantized
intra refreshing rate to use. Thus the action space consists of four parts: a channel selection decision
$a_s(k) \in \{0 \text{ (no sense)}, 1 \text{ (sense)}\}$, a spectrum sensor design $(\epsilon(k), \delta(k)) \in A_{\epsilon\delta}$, where
$A_{\epsilon\delta}$ is the set of valid points on the ROC curve, an access decision
$a_a(k) \in \{0 \text{ (no access)}, 1 \text{ (access)}\}$, and an intra refreshing rate $\beta(k) \in A_\beta$. The composite
action in slot $k$ is denoted by
$a_k = \{a_s(k), (\epsilon(k), \delta(k)), a_a(k), \beta(k)\} \in (\{0, 1\}, A_{\epsilon\delta}, \{0, 1\}, A_\beta)$.

Due to sensing and channel estimation errors, a secondary user cannot directly observe the true system
state. It can infer the system state from its decision and observation history encapsulated by the
information state. The information state is $\pi_k = \{\lambda_x(k)\}_{x \in \mathcal{X}} \in \Pi(\mathcal{X})$, where
$\lambda_x(k) \in [0, 1]$ denotes the conditional probability (given the decision and observation history) that the
system state is $x \in \mathcal{X}$ at the beginning of slot $k$ prior to state transition.
$\Pi(\mathcal{X}) = \{\lambda_x(k) \in [0, 1], \sum_{x \in \mathcal{X}} \lambda_x(k) = 1\}$ denotes the information space that includes
all possible probability mass functions on the state space $\mathcal{X}$.

At the end of the time slot, the transmitter receives observation $Y_k$. The information state is then
updated using Bayes' rule before state transition:

$$\lambda_x(k+1) = \frac{\sum_{x' \in \mathcal{X}} \lambda_{x'}(k) A_{x',x} B(y_k, x, a_k)}{\sum_{x \in \mathcal{X}} \sum_{x' \in \mathcal{X}} \lambda_{x'}(k) A_{x',x} B(y_k, x, a_k)}. \qquad (10)$$

Given the information vector $\pi_k$, the distribution of the system state $X_k$ in slot $k$ after state
transition is then given by

$$\Pr\{X_k = x\} = \sum_{x' \in \mathcal{X}} \lambda_{x'}(k) A_{x',x}, \quad \forall x \in \mathcal{X}. \qquad (11)$$
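The Bayes update of the information state can be sketched in a few lines of Python. The function and argument names are chosen here for illustration; `obs_likelihood[x]` stands for $B(y_k, x, a_k)$ evaluated at the received observation and the action taken, and the numbers in the example are hypothetical.

```python
def update_belief(belief, A, obs_likelihood):
    """One step of the information-state update.

    belief[x']        : lambda_{x'}(k), the belief before the state transition.
    A[x'][x]          : transition probability from state x' to state x.
    obs_likelihood[x] : B(y_k, x, a_k) for the observation actually received.
    """
    n = len(belief)
    # Prediction step: distribution of X_k after the state transition.
    predicted = [sum(belief[xp] * A[xp][x] for xp in range(n)) for x in range(n)]
    # Correction step: weight by the observation likelihood and normalize.
    unnormalized = [predicted[x] * obs_likelihood[x] for x in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]

# Hypothetical 2-state example.
new_belief = update_belief([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [0.7, 0.1])
```

The intermediate `predicted` vector is the post-transition state distribution, so the same routine also covers the prediction step when the observation likelihood is omitted.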

3.4 Cost and Policy 

From a user's point of view, QoS at the application layer is more important than at other layers.
Therefore, we model multimedia distortion as the immediate cost in our scheme. The immediate cost in time
slot $k$ is defined as

$$C_k = D(R, p(x_k, a_k), \beta(k)), \qquad (12)$$

where $R$ is the target bit rate and $p(x_k, a_k)$ denotes the packet loss ratio when the system is in state
$x_k$ and composite action $a_k$ is taken in time slot $k$. We assume $a_a(k) = 0$ (no access) is equivalent to
100% packet loss.

The expected total cost of the POMDP represents the overall distortion for a video sequence transmitted
over $K$ slots and can be expressed as

$$E_{\{\mu_s, \mu_{\epsilon\delta}, \mu_a, \mu_\beta\}}\left[\sum_{k=1}^{K} D(R, p(x_k, a_k), \beta(k))\right], \qquad (13)$$

where $E_{\{\mu_s, \mu_{\epsilon\delta}, \mu_a, \mu_\beta\}}$ indicates the expectation given that policies
$\mu_s$, $\mu_{\epsilon\delta}$, $\mu_a$, $\mu_\beta$ are employed.

A channel sensing policy $\mu_s$ specifies a channel to sense, $a_s$. A sensor operating policy
$\mu_{\epsilon\delta}$ specifies a spectrum sensor design $(\epsilon, \delta) \in A_{\epsilon\delta}$ based on the system tolerable
probability of collision, $\zeta$. An access policy $\mu_a$ specifies the access decision $a_a \in \{0, 1\}$. An intra
refreshing policy $\mu_\beta$ specifies the intra refreshing decision $\beta \in A_\beta$ based on the current
information state $\pi_k$.

3.5 Objective and Constraint 

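We aim to develop the joint design of an optimal policy for multimedia transmission over CR networks,
$\{\mu_s^*, \mu_{\epsilon\delta}^*, \mu_a^*, \mu_\beta^*\}$, that minimizes the expected total distortion in $K$ slots under the
collision constraint $P_c$:

$$\{\mu_s^*, \mu_{\epsilon\delta}^*, \mu_a^*, \mu_\beta^*\} = \arg\min_{\mu_s, \mu_{\epsilon\delta}, \mu_a, \mu_\beta} E_{\{\mu_s, \mu_{\epsilon\delta}, \mu_a, \mu_\beta\}}\left[\sum_{k=1}^{K} D(R, p(x_k, a_k), \beta(k))\right] \qquad (14)$$

subject to

$$P_c(k) = \Pr\{a_a(k) = 1 \mid X_k = e_S\} \le \zeta, \quad 1 \le k \le K. \qquad (15)$$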

3.6 Value Function 

Let $J_k(\pi_k)$ be the value function that represents the minimum expected cost that can be obtained
starting from slot $k$ ($1 \le k \le K$) given information state $\pi_k$ at the beginning of slot $k$. Given that the
secondary user takes action $a_k$ and observes acknowledgment $Y_k = y_k$, the cost that can be accumulated
starting from slot $k$ consists of the immediate cost $C_k = D(R, p(x_k, a_k), \beta(k))$ and the minimum expected
future cost $J_{k+1}(\pi_{k+1})$. Here $\pi_{k+1} = \{\lambda_x(k+1)\}_{x \in \mathcal{X}} = U(\pi_k \mid a_k, y_k)$ represents the updated
knowledge of the system state after incorporating the action $a_k$ and the acknowledgment $y_k$ in slot $k$.
The sensing policy is then given by

$$J_k(\pi_k) = \min_{a_k} \sum_{x \in \mathcal{X}} \sum_{x' \in \mathcal{X}} \lambda_{x'}(k) A_{x',x} \sum_{j=e_1}^{e_S} B(y_k, j, a_k)\left[D(R, p(x_k, a_k), \beta(k)) + J_{k+1}(U(\pi_k \mid a_k, y_k))\right], \quad 1 \le k \le K-1, \qquad (16)$$

$$J_K(\pi_K) = \min_{a_K} \sum_{x \in \mathcal{X}} \sum_{x' \in \mathcal{X}} \lambda_{x'}(K) A_{x',x} \sum_{j=e_1}^{e_S} B(y_K, j, a_K) D(R, p(x_K, a_K), \beta(K)). \qquad (17)$$



The value function of an unconstrained POMDP with finite action space is piecewise-linear convex
and can be solved using linear programming techniques [49]. An excellent overview of computationally
efficient algorithms is given in [39], and these algorithms can be used to solve for the optimum sensing
policy. In general, the number of linear segments that characterize the value function can grow
exponentially. In 1991, Lovejoy proposed an ingenious suboptimal algorithm for POMDPs [50]. Based on
Lovejoy's algorithm, the value function can be upper and lower bounded and efficient suboptimal solutions
can be developed as in Subsection V-D of [51]. By considering only a subset of the piecewise linear segments
that characterize the value function and discarding the other segments, one can reduce the computational
complexity. Due to space limitations, we refer the reader to Subsection V-D of [51] for details. Moreover,
solving the POMDP can be done off-line during system initialization. During the real-time multimedia
transmission, a node just needs to find the value for a specific information state according to (16) and
update the information state according to (10), which introduces little computational complexity. Finally,
by imposing structural assumptions on the transition probabilities, cost, and observation probabilities,
one can prove in some cases that the optimal policy is a threshold policy [52].

3.7 Intra Refreshing Strategy 

For a selected channel, the optimum $\beta$ selected corresponds to the most likely available state based on
$\pi_k$. Because the channel distortion grows without bound as the packet loss rate approaches 1, a busy or
unaccessed channel has infinite distortion. In this case, $\beta$ has no influence on the total distortion. If
the most likely state based on $\pi_k$ corresponds to a busy state, then the optimum choice is the $\beta$
corresponding to the most likely available state. That way, if the information state suggests the channel is
busy but in reality it is available, a $\beta$ has been selected that will minimize the effect of this error.
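The selection rule above can be sketched as a simple lookup in Python. The function name, the convention that the last belief entry is the busy state, and the per-state $\beta$ table are assumptions made here for illustration; the table would be precomputed off-line from the rate-distortion model for each available state.

```python
def select_beta(belief, beta_per_state):
    """Pick the intra refreshing rate for the most likely *available* state.

    belief         : probabilities over S states; the last entry is the busy state.
    beta_per_state : precomputed optimum beta for each of the S-1 available states.
    """
    available = belief[:-1]                    # ignore the busy state on purpose
    most_likely = max(range(len(available)), key=lambda i: available[i])
    return beta_per_state[most_likely]

# Hypothetical example: the busy state is most likely overall, yet beta is still
# chosen for the most likely available state.
chosen = select_beta([0.1, 0.2, 0.7], [0.17, 0.21])
```

Excluding the busy state from the maximization is what protects against the case where the belief wrongly indicates a busy channel.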

4 Simulation Results and Discussions 

In order to evaluate the performance of our proposed scheme, we have carried out a set of simulation
experiments using the ns-2 simulator. All simulations were run on a computer equipped with Windows 7, an
Intel Core 2 Duo P8400 CPU (2.26 GHz), and 4 GB of memory. The choice for the total time slot number $K$
in the dynamic programming depends on the convergence rate of the POMDP program. State transition
probabilities, observation probabilities, and value functions have effects on the convergence rate [39], [49].
In our simulations, the POMDP program was run over a horizon of $K = 200$. It is reasonable to use
$K = 200$ to approximate the problem with infinite horizon. We first consider a system with one channel
in Subsections 4.1 - 4.3. Then, we consider a system with two channels in Subsections 4.4 - 4.5. In all
figures the curves represent the average values, while the error bars represent the 95 percent confidence
intervals over 50 different instances (seeds).

We consider the system performance in the following four cases: (1) using perfect knowledge of
the system thus making optimal decisions, which is the best case possible, (2) making decisions based
on the most likely state indicated by the information state, which is our proposed scheme, (3) making
decisions solely based on the channel gain provided in the last acknowledgment, and (4) using a constant
$\beta$, which represents existing schemes that do not consider application layer QoS. Our goal is to compare
the distortion of different schemes as opposed to determining the absolute distortion. We use an average
distortion metric that refers to the average distortion over the time slots when the channel is available and
accessed. Video rate-distortion parameters remain constant for the duration of the simulation. The same
distortion parameters are used for all simulations: $D_s(R_s, 0) = 74$, $D_s(R_s, 1) = 124$, $\eta = 1.4$,
$a = 0.01$, $b = 1.0$, and $E[F_d(m, m-1)] = 100$.

4.1 Performance Improvement 

Fig. [3] shows the distortion of different schemes. The number of states refers to S — 1 quantized channel 
gains and one busy channel state. For simplicity we derive a transition matrix based on the probability that 
any available state stays in the same state, Pr{X fc+1 = v\X k = v}, the probability of transitioning from 
an available state to a busy state, Pr{X fe+1 = z\X k = v}, and the probability of a busy state staying busy, 
Pr{X k+1 = z\X k = z}, Vv G {ei, e 2 , e s _i}, z = e s , where v and z indicate available and busy states, 
respectively. The following parameter values are used in this example. Pr{X k+ i = v\X k = v} = 0.85, 
Pr{X k+1 = z\X k = v} = 0.05, Pr{X fe+ i = z\X k = z} = 0.1, e = 0.6, a = 0.1. From Fig. gj 
we can see that when perfect knowledge of the channel state is available, perfect decisions can be made 
for each time slot thus method (1) has the lowest average distortion. The more realistic cases occur 
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in the presence of sensing and CSI errors. Our proposed method (i.e., method 2) uses the information 
state to select the most likely optimal decisions. This method tracks the ideal case fairly closely. Both 
method 3 and method 4 have worse performance compared to the proposed scheme. This illustrates the 
performance improvement of the proposed scheme over existing schemes. In addition, we also notice 
that using a constant $\beta$ (i.e., method 4) can be worse than making decisions based solely on the previous
acknowledgment (i.e., method 3), which shows the need to consider application layer parameters and 
application QoS. Moreover, increasing the number of channel states changes the characteristics of the
channel. Consequently, the likelihood that the underlying system is in a state where the constant $\beta$ is
optimal decreases. Therefore, the performance of using a constant $\beta$ is not stable as the number of
states increases. The application layer parameter, $\beta$, should be adapted together with parameters at
lower layers.
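
The transition matrix construction described in this subsection can be sketched as follows. The text fixes only three probabilities; how the remaining probability mass is distributed is not stated, so the sketch assumes it is spread uniformly over the other available states. The function name and this uniform split are illustrative assumptions, not the exact matrix used in the simulations.

```python
def build_transition_matrix(num_states, p_stay=0.85, p_to_busy=0.05, p_busy_stay=0.1):
    """Row-stochastic matrix over S-1 available states plus one busy state.

    The busy state is the last index. Default values are the ones quoted
    for the Fig. 3 example.
    """
    if num_states < 3:
        raise ValueError("need at least two available states and one busy state")
    n_avail = num_states - 1
    busy = num_states - 1
    # ASSUMPTION: mass not used by "stay" or "to busy" is split uniformly
    # over the other available states.
    p_other = (1.0 - p_stay - p_to_busy) / (n_avail - 1)
    # ASSUMPTION: a channel leaving the busy state is equally likely to
    # enter any available state.
    p_leave_busy = (1.0 - p_busy_stay) / n_avail
    matrix = []
    for i in range(num_states):
        if i == busy:
            row = [p_leave_busy] * n_avail + [p_busy_stay]
        else:
            row = [p_other] * n_avail + [p_to_busy]
            row[i] = p_stay
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    for row in build_transition_matrix(5):
        print(["%.3f" % p for p in row])
```

Every row sums to one by construction, so the result can be used directly as a Markov chain over the quantized channel gains and the busy state.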

4.2 Effects of the Parameters in the State Transition Matrix 

We evaluate how the parameters in the transition matrix affect the average distortion. The transition matrix 
can be selected based on channel fading and primary usage. We ignore quantization errors caused by the 
limited number of states and assume the actual channel gain matches the state channel gain. Fig. 4 and
Fig. 5 show the simulation results across $\Pr\{X_{k+1} = v \mid X_k = v\}$ and
$\Pr\{X_{k+1} = z \mid X_k = v\}$, respectively. In Fig. 4 there are 5 states, $\epsilon = 0.6$, and
$\Pr\{X_{k+1} = z \mid X_k = v\} = 0.05$. This example demonstrates the cognitive nature of the system. Our
proposed method (i.e., method 2) approaches the method of using perfect knowledge of the channel state as
$\Pr\{X_{k+1} = v \mid X_k = v\}$ approaches 1. That is, the performance improves as the system dynamics
slow down, since it becomes easier to predict the actual system state. In Fig. 5 there are 5 states,
$\epsilon = 0.6$, and $\Pr\{X_{k+1} = v \mid X_k = v\} = 0.50$. From this figure, we can see that
$\Pr\{X_{k+1} = z \mid X_k = v\}$ has little impact on the performance of the proposed method. The reason is
that increasing $\Pr\{X_{k+1} = z \mid X_k = v\}$ increases the likelihood that the system transitions to
the busy state, which has little effect on the average distortion when the channel is available and accessed.

4.3 Effects of the Parameters in the Observation Matrix

The observation matrix is derived from the sensor operating point, $\epsilon$, and the standard deviation of
the receiver channel estimation error, $\sigma$. Fig. 6 and Fig. 7 show how $\sigma$ and $\epsilon$ affect the
average distortion. The following parameters are used in Fig. 6: there are 5 states, $\epsilon = 0.6$,
$\Pr\{X_{k+1} = v \mid X_k = v\} = 0.85$, and $\Pr\{X_{k+1} = z \mid X_k = v\} = 0.05$. We can see from
Fig. 6 that as the receiver estimation degrades, the acknowledgment provides less information on the actual
channel gain and the average distortion of our method increases. $\epsilon$ and $\delta$ are related through
the sensor ROC, and adjusting $\epsilon$ implies a change to the system probability of collision requirement.
In Fig. 7, $\Pr\{X_{k+1} = v \mid X_k = v\} = 0.85$, $\Pr\{X_{k+1} = z \mid X_k = v\} = 0.05$, and
$\sigma = 0.1$. This figure shows that the average distortion increases as the probability of false
alarm increases.
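
To illustrate why a larger estimation error makes the acknowledgment less informative, the sketch below builds the channel-gain part of an observation matrix under a simple assumed model: the receiver's estimate is the true quantized gain plus zero-mean Gaussian noise with standard deviation $\sigma$, and the estimate is mapped to the nearest quantization level. The gain levels, the Gaussian model, and the function names are assumptions made for illustration; the sensing part of the observation (governed by $\epsilon$ and $\delta$) is not modeled here.

```python
import math


def gaussian_cdf(x, mean, sigma):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def gain_observation_matrix(gain_levels, sigma):
    """obs[i][j] = Pr(receiver reports level j | true level is i).

    gain_levels must be sorted in increasing order. Bin edges sit midway
    between adjacent levels; the two outer bins are open-ended.
    """
    edges = [(lo + hi) / 2.0 for lo, hi in zip(gain_levels, gain_levels[1:])]
    matrix = []
    for true_gain in gain_levels:
        cdf = [0.0] + [gaussian_cdf(e, true_gain, sigma) for e in edges] + [1.0]
        matrix.append([cdf[j + 1] - cdf[j] for j in range(len(gain_levels))])
    return matrix


if __name__ == "__main__":
    levels = [0.25, 0.5, 0.75, 1.0]  # hypothetical quantized gains
    for sigma in (0.1, 0.5):
        obs = gain_observation_matrix(levels, sigma)
        print(sigma, ["%.3f" % obs[i][i] for i in range(len(levels))])
```

Under this assumed model the diagonal entries shrink as $\sigma$ grows, which is one way to see the trend reported for Fig. 6.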

4.4 Effects of the Transition Matrix on Channel Selection Policy 

We consider a system with N = 2 channels and S = 3 states to evaluate the performance of the channel 
selection policy. We will use a spectrum utilization (SU) metric to evaluate the sensor policy performance. 
SU represents the percentage of time slots where an available channel was selected for sensing. SU is 
an important parameter when evaluating video QoS. The channel distortion is infinite when a channel is 
busy or not accessed. Improving the SU reduces the percentage of time slots where a busy channel
is selected for sensing, thus improving the application layer QoS. The application layer QoS is improved
using a two-step process. First, we select a channel to maximize SU, thus reducing the large distortion
introduced when the channel is unavailable. Second, for an available and accessed channel, we select the
intra refreshing rate to minimize distortion for a particular channel gain.
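
The two metrics used in this section can be written down directly from their definitions. The sketch below is a minimal illustration; the function names and the list-based trace format are assumptions, not part of the simulation code behind the figures.

```python
def spectrum_utilization(selected, available):
    """Fraction of time slots in which the channel chosen for sensing was available.

    selected[t] is the channel index chosen in slot t.
    available[t][c] is True when channel c is free of primary traffic in slot t.
    """
    hits = sum(1 for t, c in enumerate(selected) if available[t][c])
    return hits / len(selected)


def average_distortion(distortions, accessed):
    """Mean distortion over the slots where the channel was available and accessed.

    Slots with accessed[t] == False are excluded, matching the average
    distortion metric defined for the simulations.
    """
    used = [d for d, ok in zip(distortions, accessed) if ok]
    return sum(used) / len(used) if used else float("nan")


if __name__ == "__main__":
    selected = [0, 1, 0, 1]
    available = [[True, False], [True, False], [False, True], [False, True]]
    print(spectrum_utilization(selected, available))
    print(average_distortion([80.0, 90.0, 100.0], [True, False, True]))
```

Excluding unavailable slots from the distortion average keeps the two steps separable: SU captures how often a usable channel is found, and the average distortion captures how well the intra refreshing rate is matched to the channel once it is in use.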

The two channels, channel 1 and channel 2, are simulated with the same number of states (i.e.,
quantized channel gains) and observation probabilities but asymmetric transition probabilities. Channel 2
has a higher primary usage than channel 1. Based on previous observations, actions, and the POMDP-derived
policy, the secondary transmitter/receiver pair dynamically selects the channel that will most likely
maximize application layer QoS.

We evaluate SU and average distortion performance for three cases: (1) POMDP channel selection,
which is our proposed scheme, (2) randomly selecting channel 1 or 2 and using a constant $\beta = 0.1$, which
represents a non-adaptive scheme, and (3) using perfect knowledge of the system state, which represents
the ideal case.

SU performance with varying transition matrix parameters is shown in Fig. 8 and Fig. 9. In both plots
we only vary the transition matrix parameters of channel 1. Both channels have equal observation matrix
parameters $\epsilon^1 = \epsilon^2 = 0.62$ and $\sigma^1 = \sigma^2 = 0.1$.

In Fig. 8 we vary the probability that channel 1 stays busy, $\Pr\{X^1_{k+1} = z \mid X^1_k = z\}$, with
$\Pr\{X^1_{k+1} = z \mid X^1_k = v\} = 0.2$, $\Pr\{X^2_{k+1} = z \mid X^2_k = z\} = 0.8$, and
$\Pr\{X^2_{k+1} = z \mid X^2_k = v\} = 0.6$. In Fig. 9 we vary the probability that channel 1 transitions to
the busy state, $\Pr\{X^1_{k+1} = z \mid X^1_k = v\}$, with $\Pr\{X^1_{k+1} = z \mid X^1_k = z\} = 0.4$,
$\Pr\{X^2_{k+1} = z \mid X^2_k = z\} = 0.8$, and $\Pr\{X^2_{k+1} = z \mid X^2_k = v\} = 0.6$.
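
As a rough sanity check on these settings, the long-run fraction of busy slots on each channel can be computed from the two busy-related probabilities alone, assuming (as in the single-channel setup) that every available state has the same probability of transitioning to the busy state. Under that assumption the chain collapses to two aggregate states, available and busy. The sketch below is an illustration with hypothetical function names, and the channel 1 operating point combines the fixed values quoted for Fig. 8 and Fig. 9.

```python
def stationary_busy_probability(p_avail_to_busy, p_busy_stay):
    """Long-run fraction of busy slots for the aggregated two-state chain.

    Balance equation: pi_busy * (1 - p_busy_stay) = (1 - pi_busy) * p_avail_to_busy.
    """
    return p_avail_to_busy / (p_avail_to_busy + 1.0 - p_busy_stay)


def random_selection_su(busy_probs):
    """Expected SU when each channel is picked uniformly at random in every slot."""
    return sum(1.0 - p for p in busy_probs) / len(busy_probs)


if __name__ == "__main__":
    # Channel 1: illustrative point (to-busy 0.2, stay-busy 0.4).
    busy_1 = stationary_busy_probability(0.2, 0.4)
    # Channel 2: values quoted in the text (to-busy 0.6, stay-busy 0.8).
    busy_2 = stationary_busy_probability(0.6, 0.8)
    print(round(busy_1, 3), round(busy_2, 3))
    print(round(random_selection_su([busy_1, busy_2]), 3))
```

At this operating point channel 1 is busy about a quarter of the time and channel 2 about three quarters of the time, so picking a channel at random would find an available channel in roughly half of the slots. This is only a steady-state estimate; the simulated curves depend on the full policy and on sensing errors.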

In both cases, the SU of our scheme is greater than that of the non-adaptive scheme. Our proposed
scheme senses the surrounding environment to learn and adapt channel selection. However, it takes several
time slots for the policy to learn the system state, so the performance of our scheme improves with
slower transition dynamics. That is, our scheme approaches the perfect case as
$\Pr\{X^1_{k+1} = z \mid X^1_k = v\}$ approaches 0, as is shown in Fig. 9. Our scheme provides closer to
optimal performance when there is a large difference in channel availability between the two channels, as it
becomes easier to distinguish the better channel. This is demonstrated in Fig. 8, where the performance of
our scheme is closer to optimal at low $\Pr\{X^1_{k+1} = z \mid X^1_k = z\}$ relative to
$\Pr\{X^2_{k+1} = z \mid X^2_k = z\}$. In Fig. 10 we show the average distortion as a function of the
probability that channel 1 stays busy. The average distortion of our scheme is lower than that of the
non-adaptive scheme. Transition matrix parameters have little effect on the average distortion. Our scheme
outperforms the non-adaptive scheme because it selects the channel with the better channel gain and adapts
the intra refreshing rate for the selected channel.

4.5 Effects of the Observation Matrix on Channel Selection Policy

SU with varying sensor operating point is shown in Fig. 11, with $\Pr\{X^1_{k+1} = z \mid X^1_k = z\} = 0.4$,
$\Pr\{X^1_{k+1} = z \mid X^1_k = v\} = 0.15$, $\Pr\{X^2_{k+1} = z \mid X^2_k = z\} = 0.6$, and
$\Pr\{X^2_{k+1} = z \mid X^2_k = v\} = 0.2$. Observation parameters are derived from the operating
characteristics of the secondary users and are not likely to be different for each channel. Thus both
channels are simulated with symmetrical observation parameters, $\epsilon^1 = \epsilon^2 = \epsilon$ and
$\sigma^1 = \sigma^2 = \sigma$. In Fig. 11 we vary the spectrum operating point $\epsilon$. In Fig. 12 we show
the average distortion with varying $\epsilon$. The observation parameters are shown to have little effect on
the SU and average distortion performance of our proposed scheme.

These simulation results demonstrate some interesting trends in the design and optimization of CR
networks from a cross-layer design perspective. Adaptively adjusting the intra refreshing rate to
accommodate time varying wireless channels is an effective way to reduce distortion. By using all previous
actions and observations we can build an information state that becomes more accurate over time.
Performance of using the information state to select the intra refreshing rate improves as the system
dynamics slow down. In a CR environment the MAC access strategy is derived from the accuracy of the
spectrum sensor. The total distortion is limited by the availability of the channel. Distortion performance
will degrade if primary usage increases or a very low system tolerable probability of collision is required.
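
The way an information state is refined from one slot to the next can be illustrated with a generic Bayes-filter step: predict with the transition matrix, then reweight by how likely the received observation is under each state. The sketch below is a textbook form given for illustration only. The function names and the example numbers are assumptions, and the update used in the paper may additionally condition on the chosen action.

```python
def update_belief(belief, transition, obs_likelihood):
    """One information-state update.

    belief[i]         : current probability of state i
    transition[i][j]  : Pr(next state j | current state i)
    obs_likelihood[j] : Pr(received observation | next state j)
    """
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    weighted = [predicted[j] * obs_likelihood[j] for j in range(n)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [w / total for w in weighted]


def most_likely_state(belief):
    """Index of the state with the highest probability in the information state."""
    return max(range(len(belief)), key=lambda s: belief[s])


if __name__ == "__main__":
    transition = [[0.9, 0.1], [0.2, 0.8]]  # hypothetical two-state chain
    belief = [0.5, 0.5]
    for _ in range(3):
        belief = update_belief(belief, transition, [0.8, 0.3])
        print(["%.3f" % b for b in belief], most_likely_state(belief))
```

Repeated observations that favor the same state concentrate the belief on it, which is why a decision based on the most likely state tracks the perfect-knowledge case more closely when the chain changes state slowly.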

5 Conclusions and Future Work 

In this paper, we have presented an integrated approach for multimedia transmission over cognitive radio
networks. An important application layer parameter, the intra refreshing rate, can be adjusted together with
parameters at other layers based on the channel condition sensed by the secondary users. A low
complexity dynamic programming framework was presented to obtain the optimal intra refreshing policy.
By modeling the system as a Markov process, we have derived a POMDP for optimal channel selection
to minimize distortion while improving spectrum efficiency. Simulation results demonstrated the
performance gain achieved by the adaptive transmission scheme. Future work is in progress to consider
other QoS measures at the application layer.

Figure 1: Block diagram of multimedia transmission over cognitive radio networks.

Figure 2: Information state in POMDP.

Figure 3: Average distortion vs. the number of states in different schemes.

Figure 4: Average distortion vs. the probability of staying in the same state.

Figure 5: Average distortion vs. the probability of transitioning to the busy state.

Figure 6: Average distortion vs. the receiver channel estimation standard deviation, $\sigma$.

Figure 7: Average distortion vs. the sensor operating point, $\epsilon$.

Figure 8: Two Channel Scenario: Spectrum utilization vs. the probability of staying in the busy state of
channel 1.

Figure 9: Two Channel Scenario: Spectrum utilization vs. the probability of transitioning to the busy state
of channel 1.

Figure 10: Two Channel Scenario: Average distortion vs. the probability of staying in the busy state of
channel 1.

Figure 11: Two Channel Scenario: Spectrum utilization vs. the receiver channel estimation standard
deviation, $\sigma$.

Figure 12: Two Channel Scenario: Average distortion vs. the receiver channel estimation standard
deviation, $\sigma$.